-
Notifications
You must be signed in to change notification settings - Fork 1.3k
CSHARP-5572: Implement new KnownSerializerFinder #1700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
hi @rstam, what's the status of this effort? |
It is progressing well, but it is a large effort. I work on this whenever I'm not working on some smaller higher priority task. |
86fbb24 to
56a464c
Compare
| } | ||
| } | ||
|
|
||
| if (_map.TryGetValue(node, out var existingSerializer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move the duplicate check to the beginning of the AddSerializer method to implement fail-fast behavior. This would avoid unnecessary serializer transformation work when a serializer already exists for the node.
However, I notice the current implementation places this check at the end, likely to preserve information about the conflicting serializer for the exception. While this debugging information could be helpful, I question just how useful it is. Knowing where a duplicate attempt happened seems more useful knowing what the conflicting serializer would have been.
I lean toward the fail-fast approach since the conflicting serializer details seem like a "nice-to-have" rather than essential debugging information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move the error check further up then we have to remove information about the redundant serializer from the error message.
Also, I don't think we need to worry about failing fast because we actually never expect to hit this error. This check is here to help us find bugs. In the absence of bugs this exception will never be thrown.
| private bool _isMakingProgress = true; | ||
| private readonly KnownSerializerMap _knownSerializers; | ||
| private int _oldKnownSerializersCount = 0; | ||
| private int _pass = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this is never used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
| var newKnownSerializersCount = _knownSerializers.Count; | ||
| if (newKnownSerializersCount == _oldKnownSerializersCount) | ||
| { | ||
| if (_useDefaultSerializerForConstants) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we maybe make use of a state enum as such:
private enum DiscoveryPhase
{
InferringFromContext, // Not using defaults
UsingDefaultSerializers, // Last attempt with defaults
Complete // No more progress possible
}
to replace the use of all these booleans used in keeping track of the state of the visitor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's somewhat reasonable, but I think the booleans work better here.
There are only 2 booleans.
One issue is that if we ever had more states the order of the enum values could get very confusing and maybe even no order is possible.
32aae48 to
7f2a64c
Compare
sanych-sun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review.
src/MongoDB.Driver/Linq/Linq3Implementation/KnownSerializerFinders/KnownSerializerFinder.cs
Outdated
Show resolved
Hide resolved
src/MongoDB.Driver/Linq/Linq3Implementation/KnownSerializerFinders/KnownSerializerMap.cs
Outdated
Show resolved
Hide resolved
...iver/Linq/Linq3Implementation/KnownSerializerFinders/KnownSerializerFinderVisitMethodCall.cs
Outdated
Show resolved
Hide resolved
...MongoDB.Driver/Linq/Linq3Implementation/SerializerFinders/SerializerFinderVisitMethodCall.cs
Show resolved
Hide resolved
...MongoDB.Driver/Linq/Linq3Implementation/SerializerFinders/SerializerFinderVisitMethodCall.cs
Outdated
Show resolved
Hide resolved
src/MongoDB.Driver/Linq/Linq3Implementation/SerializerFinders/SerializerFinderHelperMethods.cs
Show resolved
Hide resolved
...ver/Linq/Linq3Implementation/KnownSerializerFinders/KnownSerializerFinderVisitConditional.cs
Outdated
Show resolved
Hide resolved
...DB.Driver/Linq/Linq3Implementation/KnownSerializerFinders/KnownSerializerFinderVisitIndex.cs
Outdated
Show resolved
Hide resolved
|
|
||
| bool IsTupleOrValueTuple(Type type) | ||
| { | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For net6 and higher there is a better way to check: both Tuple and ValueTuple implements ITuple interface. I suppose we can have #if def here and use interface approach for modern targets, when have something like the proposed code for net472.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to know!
Is there any danger of false positives if unexpected classes also happen to implement ITuple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not doing just yet until we have a discussion about false positives.
src/MongoDB.Driver/Linq/Linq3Implementation/SerializerFinders/SerializerFinderVisitUnary.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements a new KnownSerializerFinder system to replace the previous approach of dynamically determining serializers during expression translation. The new system pre-analyzes expressions to build a map of known serializers for each expression node, enabling more reliable and consistent serializer resolution throughout the translation process.
Key Changes:
- Introduced a comprehensive
KnownSerializerFinderinfrastructure that pre-computes serializers for expression nodes - Updated
TranslationContextto accept and use the pre-computed serializer map - Refactored multiple translators to leverage the new known serializer system
- Added several new serializer types to support the enhanced type inference
Reviewed changes
Copilot reviewed 98 out of 99 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
KnownSerializerFinder*.cs |
New visitor-based system for discovering and mapping serializers to expression nodes |
TranslationContext.cs |
Extended to store and provide access to known serializers map |
LinqProviderAdapter.cs |
Updated all translation entry points to use the new KnownSerializerFinder |
*Serializer.cs |
Added new serializer types (Polymorphic, Upcasting, Unknowable, etc.) to support enhanced type inference |
| Test files | Updated test contexts to use new TranslationContext.Create signatures |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| DeduceSerialiers(); | ||
| base.VisitListInit(node); | ||
| DeduceSerialiers(); | ||
|
|
||
| return node; | ||
|
|
||
| void DeduceSerialiers() |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'DeduceSerialiers' to 'DeduceSerializers'.
| DeduceSerialiers(); | |
| base.VisitListInit(node); | |
| DeduceSerialiers(); | |
| return node; | |
| void DeduceSerialiers() | |
| DeduceSerializers(); | |
| base.VisitListInit(node); | |
| DeduceSerializers(); | |
| return node; | |
| void DeduceSerializers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| DeduceConditionaSerializers(); | ||
| base.VisitConditional(node); | ||
| DeduceConditionaSerializers(); | ||
|
|
||
| return node; | ||
|
|
||
| void DeduceConditionaSerializers() |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'DeduceConditionaSerializers' to 'DeduceConditionalSerializers'.
| DeduceConditionaSerializers(); | |
| base.VisitConditional(node); | |
| DeduceConditionaSerializers(); | |
| return node; | |
| void DeduceConditionaSerializers() | |
| DeduceConditionalSerializers(); | |
| base.VisitConditional(node); | |
| DeduceConditionalSerializers(); | |
| return node; | |
| void DeduceConditionalSerializers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| return sourceSerializer; | ||
| } | ||
|
|
||
| // handle conversionsn to BsonValue before any others |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'conversionsn' to 'conversions'.
| // handle conversionsn to BsonValue before any others | |
| // handle conversion to BsonValue before any others |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| _knownSerializers = knownSerializers; | ||
| } | ||
|
|
||
| public Expression ExpressionWithUnknownSerialier => _expressionWithUnknownSerializer; |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'ExpressionWithUnknownSerialier' to 'ExpressionWithUnknownSerializer'.
| public Expression ExpressionWithUnknownSerialier => _expressionWithUnknownSerializer; | |
| public Expression ExpressionWithUnknownSerializer => _expressionWithUnknownSerializer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
…pport $filter with limit : 1
| } | ||
| } | ||
| """; | ||
| if (!Feature.FilterLimit.IsSupported(CoreTestConfiguration.MaxWireVersion)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually create two strings depending on whether a feature is supported or not but, in this case, the MQL is so long it's easier to just remove limit : 1 if it is not supported.
sanych-sun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also review all TODOs (remove, address or create a ticket for future if we can postpone the changes) and failing unit-test.
| } | ||
|
|
||
| if (IsSymmetricalBinaryOperator(@operator) && | ||
| CanDeduceSerializer(leftExpression, rightExpression, out var unknownNode, out var otherNodeSerializer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I found only 2 usages of CanDeduceSerializer, both looks similar (TODO in the another usage says we have to check value types). I think we can simplify by refactoring to TryDeduceSerializer and do the type check and AddNodeSerializer inside it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the CanDeduceSerializer method entirely becase there was another existing method that could be used.
| } | ||
| else | ||
| { | ||
| DeduceUnknowableSerializer(node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It worth to add a comment why we decided stop looking for the serializer for the node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| } | ||
| } | ||
|
|
||
| internal class IgnoreNodeSerializer<TValue> : SerializerBase<TValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What the difference between IgnoreNodeSerializer and UnknowableSerializer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IgnoreSerializer is when we know what the node means but there is no serializer needed for the node. But since we throw an exception if a node has no serializer we assign this dummy serializer to such nodes.
UnknowableSerializer is when we know what the node means in general, but not in this particular case. An example would be a method call expression with an unknown method (we know about method call expressions in general but can encounter methods we don't know anything about).
UnknowableSerializer comes into play with client-side projections. If we didn't assign the UnknowableSerializer dummy serializer to method call nodes we didn't know about we would throw a missing serializer exception which would prevent us from getting to the client-side projection.
| protected override Expression VisitListInit(ListInitExpression node) | ||
| { | ||
| var newExpression = node.NewExpression; | ||
| var initializers = node.Initializers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initializers variable seems to be unused. Should we handle initializers somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure yet. Added a TODO comment for later investigation.
| var containingExpression = node.Expression; | ||
| if (IsKnown(containingExpression, out containingSerializer)) | ||
| { | ||
| // TODO: handle special cases like DateTime.Year etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO still actual?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I don't know the full list.
I've edited the comment to not mention DateTime.Year (since that one is already done)
|
|
||
| bool IsDictionaryContainsKeyMethod(out Expression keyExpression) | ||
| { | ||
| if (method.DeclaringType.Name.Contains("Dictionary") && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this method to DictionaryMethod helper class and avoid Contains("Dictionary") by checking for interface or base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
|
||
| void DeduceAggregateMethodSerializers() | ||
| { | ||
| if (method.IsOneOf(__aggregateMethods)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find any usage of results of IsItemSerializerKnown method call results, so it probably could be removed. And if it's removed then outer if is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed this one line.
I don't understand why you think the outer if is redundant.
| } | ||
| } | ||
|
|
||
| void DeduceAsinhMethodSerializers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have number of exactly the same methods in this class to process all math trigonometry methods, should we group then into a single collection __trigonometricMathMethods and process in the single method like:
void DeduceTrigonometryMethodSerializers()
{
if (method.Is(__trigonometricMathMethods))
{
DeduceReturnsDoubleSerializer();
}
else
{
DeduceUnknownMethodSerializer();
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
| var isNullable = toType.IsNullable(); | ||
| var valueType = isNullable ? Nullable.GetUnderlyingType(toType) : toType; | ||
|
|
||
| var valueSerializer = (IBsonSerializer)(Type.GetTypeCode(valueType) switch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic looks very similar to StandardSerializers.GetSerializer. Can we leverage it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a TODO for future investigation.
There are 11 types handled here that StandardSerializers doesn't know about and a few of the ones it does know about are configured differentlly here, so it's tricky.
|
|
||
| case ExpressionType.NewArrayInit: | ||
| DeduceNewArrayInitSerializers(); | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we throw in default case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I don't think it's necessary.
First of all, there are no other cases here AFAIK.
And if there are, it will result in a missing serializer exception a bit later on.
I don't think we normally throw exceptions when deducing things in the SerializerFinder. We either deduce something or we do nothing.
rstam
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed comments and requested changes.
| } | ||
|
|
||
| if (IsSymmetricalBinaryOperator(@operator) && | ||
| CanDeduceSerializer(leftExpression, rightExpression, out var unknownNode, out var otherNodeSerializer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the CanDeduceSerializer method entirely becase there was another existing method that could be used.
| { | ||
| _ when declaringType == typeof(BsonValue) => GetBsonValuePropertySerializer(), | ||
| _ when IsCollectionCountOrLengthProperty() => GetCollectionCountOrLengthPropertySerializer(), | ||
| _ when declaringType == typeof(DateTime) => GetDateTimePropertySerializer(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not until we add support for DateTimeOffset in general.
No tests are failing because we didn't do this.
| var containingExpression = node.Expression; | ||
| if (IsKnown(containingExpression, out containingSerializer)) | ||
| { | ||
| // TODO: handle special cases like DateTime.Year etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I don't know the full list.
I've edited the comment to not mention DateTime.Year (since that one is already done)
|
|
||
| return memberName switch | ||
| { | ||
| "Keys" => ICollectionSerializer.Create(keySerializer), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is surprising.
It's because Dictionary<TKey, TValue>.Keys and IDictionary<TKey, TValue>.Keys have different types. Same for Values.
An unexpected glitch in Microsoft's design of these types. (Constrained by having to maintain backward compatibility).
|
|
||
| internal partial class SerializerFinderVisitor | ||
| { | ||
| private static readonly HashSet<MethodInfo> __absMethods = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to MathMethod.cs from where it can be shared.
| } | ||
| } | ||
|
|
||
| void DeduceAsinhMethodSerializers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
| var isNullable = toType.IsNullable(); | ||
| var valueType = isNullable ? Nullable.GetUnderlyingType(toType) : toType; | ||
|
|
||
| var valueSerializer = (IBsonSerializer)(Type.GetTypeCode(valueType) switch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a TODO for future investigation.
There are 11 types handled here that StandardSerializers doesn't know about and a few of the ones it does know about are configured differentlly here, so it's tricky.
|
|
||
| bool IsDictionaryContainsKeyMethod(out Expression keyExpression) | ||
| { | ||
| if (method.DeclaringType.Name.Contains("Dictionary") && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
|
||
| case ExpressionType.NewArrayInit: | ||
| DeduceNewArrayInitSerializers(); | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I don't think it's necessary.
First of all, there are no other cases here AFAIK.
And if there are, it will result in a missing serializer exception a bit later on.
I don't think we normally throw exceptions when deducing things in the SerializerFinder. We either deduce something or we do nothing.
| } | ||
| } | ||
|
|
||
| internal class IgnoreNodeSerializer<TValue> : SerializerBase<TValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IgnoreSerializer is when we know what the node means but there is no serializer needed for the node. But since we throw an exception if a node has no serializer we assign this dummy serializer to such nodes.
UnknowableSerializer is when we know what the node means in general, but not in this particular case. An example would be a method call expression with an unknown method (we know about method call expressions in general but can encounter methods we don't know anything about).
UnknowableSerializer comes into play with client-side projections. If we didn't assign the UnknowableSerializer dummy serializer to method call nodes we didn't know about we would throw a missing serializer exception which would prevent us from getting to the client-side projection.
No description provided.